
[WIP] Second pass at LaunchConfig #727

Draft
tpn wants to merge 7 commits into NVIDIA:main from tpn:280-launch-config-v2

Conversation

Contributor

@tpn tpn commented Jan 20, 2026

No description provided.


copy-pr-bot bot commented Jan 20, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Contributor

@cpcloud cpcloud left a comment


Thanks for the PR!

This really needs some more explicit motivation in the PR description, as well as some real justification for all the duplicated tooling.


Not entirely sure what the purpose of this file is beyond what's happening in the existing test_kernel_launch.py benchmarks.


I think all of this can be done with the existing pytest-benchmark plugin.


I think all or most of this functionality can be done with the existing pytest-benchmark plugin.

I'd really like to avoid duplicating functionality, especially if it's AI-generated duplication.

- `bench-launch-overhead`
- `bench`
- `benchcmp`
- `bench-against`

While this script doesn't do a three-way comparison, it also doesn't require writing any new code to run it.

Can we try to reuse bench-against instead of reinventing a lot of what that already does?

Comment on lines +84 to +100
def some_kernel_1():
    return

@cuda.jit("void(float32[:])")
def some_kernel_2(arr1):
    return

@cuda.jit("void(float32[:],float32[:])")
def some_kernel_3(arr1, arr2):
    return

@cuda.jit("void(float32[:],float32[:],float32[:])")
def some_kernel_4(arr1, arr2, arr3):
    return

@cuda.jit("void(float32[:],float32[:],float32[:],float32[:])")
def some_kernel_5(arr1, arr2, arr3, arr4):

These are nearly identical to the existing benchmarks. Let's avoid repeating existing benchmarks and tools that run them.


copy-pr-bot bot commented Jan 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tpn tpn force-pushed the 280-launch-config-v2 branch from 378b9c4 to 7944b5d Compare January 26, 2026 18:47
@gmarkall gmarkall added the "2 - In Progress" (Currently a work in progress) label Feb 3, 2026
@tpn tpn force-pushed the 280-launch-config-v2 branch from b45e931 to 030f095 Compare February 19, 2026 03:48
@tpn tpn force-pushed the 280-launch-config-v2 branch from 030f095 to 40c3c5c Compare February 20, 2026 23:32

3 participants